fix(sqlite_writer): skip index cells that exceed a single page #303

Merged
DeusData merged 1 commit into DeusData:main from jjoos:fix/sqlite-index-btree-overflow on May 10, 2026


@jjoos jjoos commented Apr 30, 2026

Problem

write_index_btree() crashes with SIGBUS when indexing large repositories (observed on a Ruby codebase with ~215K nodes, specifically the HackerOne core monolith with ~67K functions).

The crash occurs in the page-buffer writer when an index cell's payload (qualified_name + file_path + properties) exceeds the SQLite page size. The code flushes the current page and assumes the fresh page will always accept the cell — but oversized cells still fail pb_cell_fits() after the flush. The subsequent pb_add_cell() causes a content_offset underflow, corrupting the page header and triggering SIGBUS on the next write.
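The underflow described above can be illustrated with a minimal sketch. It is not the project's code: the `page_buf` layout, the 16-bit `content_offset`, the 8-byte header, the 2-byte cell pointer, and the helper names `fresh_page`, `cell_fits`, and `unchecked_new_offset` are all assumptions made for illustration. The only point is that subtracting an oversized cell length from an unsigned offset wraps around instead of failing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical, simplified page buffer: cells are packed downward from the
 * end of the page, so content_offset shrinks as cells are added. */
typedef struct {
    uint16_t content_offset; /* start of the cell content area */
    uint16_t header_end;     /* first byte after header and cell pointers */
} page_buf;

/* A freshly initialised page: nothing stored yet, assumed 8-byte header. */
static page_buf fresh_page(void) {
    page_buf pb = { (uint16_t)PAGE_SIZE, 8 };
    return pb;
}

/* Returns 1 when a cell of cell_len bytes plus its assumed 2-byte pointer
 * fits in the gap between the header area and the content area. */
static int cell_fits(const page_buf *pb, size_t cell_len) {
    assert(pb->content_offset >= pb->header_end); /* page invariant */
    size_t free_bytes = (size_t)(pb->content_offset - pb->header_end);
    return cell_len + 2 <= free_bytes;
}

/* The offset an unchecked add would store. For an oversized cell the
 * unsigned subtraction wraps and yields an offset outside the page. */
static uint16_t unchecked_new_offset(const page_buf *pb, size_t cell_len) {
    return (uint16_t)(pb->content_offset - cell_len);
}
```

Under these assumptions, a 5000-byte cell fails `cell_fits()` even on `fresh_page()`, and `unchecked_new_offset()` returns 64632, far beyond the 4096-byte page. Any later write that trusts that offset touches memory outside the buffer.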

Reproduction

Index any repository with functions whose qualified names + file paths exceed ~4000 bytes combined (the SQLite page size minus overhead). A Ruby monolith with deeply nested namespaces and long file paths reliably triggers this.

codebase-memory-mcp index --path /path/to/large-ruby-repo
# → SIGBUS in write_index_btree at sqlite_writer.c:1218

Fix

After pb_promote_and_flush(), re-check pb_cell_fits(). If the cell still doesn't fit on an empty page, continue past it. Index entries whose keys exceed a full page cannot be stored in a leaf and are silently dropped — the rest of the index is correctly preserved.

This is the index-btree counterpart to PR #175 (record overflow pages), which fixed the same shape of bug for the record-btree writer. The record side uses overflow pages; the index side of this writer has no overflow mechanism, so skipping is the correct behavior.
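The shape of the guard can be sketched as follows. This is a simplified model, not the patch itself: `pb_reset`, `pb_fits`, `pb_flush`, `pb_add`, and `write_cells` are hypothetical stand-ins for the real `pb_cell_fits()`, `pb_promote_and_flush()`, and `pb_add_cell()`, and the page, header, and pointer sizes are assumed values. Cells are represented only by their lengths.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE   4096u /* assumed page size */
#define PAGE_HEADER 8u    /* assumed leaf header size */
#define CELL_PTR    2u    /* assumed per-cell pointer size */

typedef struct {
    size_t used;      /* bytes consumed by header, pointers and cells */
    size_t pages_out; /* pages flushed so far */
} page_buf;

static void pb_reset(page_buf *pb) { pb->used = PAGE_HEADER; }

static int pb_fits(const page_buf *pb, size_t cell_len) {
    return pb->used + CELL_PTR + cell_len <= PAGE_SIZE;
}

static void pb_flush(page_buf *pb) {
    pb->pages_out++;
    pb_reset(pb);
}

static void pb_add(page_buf *pb, size_t cell_len) {
    assert(pb_fits(pb, cell_len)); /* precondition the unguarded path broke */
    pb->used += CELL_PTR + cell_len;
}

/* Packs cells into pages and returns how many were skipped as oversized. */
static size_t write_cells(page_buf *pb, const size_t *cell_lens, size_t n) {
    size_t skipped = 0;
    pb->pages_out = 0;
    pb_reset(pb);
    for (size_t i = 0; i < n; i++) {
        if (!pb_fits(pb, cell_lens[i])) {
            if (pb->used > PAGE_HEADER) {
                pb_flush(pb); /* start a fresh page */
            }
            if (!pb_fits(pb, cell_lens[i])) {
                skipped++; /* too large even for an empty page */
                continue;
            }
        }
        pb_add(pb, cell_lens[i]);
    }
    if (pb->used > PAGE_HEADER) {
        pb_flush(pb); /* flush the final partial page */
    }
    return skipped;
}
```

With cell lengths of 2000, 2000, 5000, and 100 bytes, this model writes two pages and skips the 5000-byte cell. The second `pb_fits()` check is the part that corresponds to the fix: without it, `pb_add()` would be reached with its precondition already violated.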

Changes

  • internal/cbm/sqlite_writer.c: 10 lines added, 3 removed in write_index_btree().

Testing

  • Verified by indexing a 215K-node Ruby repo that previously crashed — completes successfully with all other nodes indexed.
  • The skipped cells are edge cases (extremely long qualified names) that don't affect query correctness for normal usage.

Related

write_index_btree() flushed the page buffer whenever the next cell
didn't fit, on the assumption that a freshly-initialised page would
always accept any cell. That invariant fails for index cells whose
payload (qualified_name, file_path, etc.) is larger than a full
SQLite page — after the flush, pb_cell_fits() is still false but the
code calls pb_add_cell() anyway. The subsequent content_offset
underflow corrupts the page header and the next write triggers
SIGBUS on large repos (observed on a Ruby codebase with ~215K nodes).

Re-check pb_cell_fits() after the flush and continue past cells
that still don't fit. Index entries whose keys exceed a full page
can't be stored and are not expected to survive the writer anyway —
the rest of the index is correctly preserved and the indexer no
longer crashes. Record overflow pages (DeusData#175) already covers the
record-btree side of this same shape of bug.
@DeusData added the bug and stability/performance labels on May 4, 2026
@DeusData merged commit 98d5cac into DeusData:main on May 10, 2026
@DeusData
Owner

Merged via rebase, thanks @jjoos. The diagnosis is dead on — the asymmetry between the record-btree path (overflow pages, fixed in #175) and the index-btree path (no overflow mechanism) is exactly why the same shape of guard produced two different bug surfaces, and the fix preserves SQLite's invariants without inventing a new overflow scheme.

Following up on main with a small enhancement: a cbm_log_warn("sqlite_writer.index_cell_oversize", ...) on the skip path so future occurrences (extra-deep namespaces, very long paths) are observable rather than silent. Doesn't change the fix's behavior — just makes the dropped cells visible in logs for diagnosis.

